Real-time Bayesian anomaly detection in streaming environmental data
نویسندگان
چکیده
[1] With large volumes of data arriving in near real time from environmental sensors, there is a need for automated detection of anomalous data caused by sensor or transmission errors or by infrequent system behaviors. This study develops and evaluates three automated anomaly detection methods using dynamic Bayesian networks (DBNs), which perform fast, incremental evaluation of data as they become available, scale to large quantities of data, and require no a priori information regarding process variables or types of anomalies that may be encountered. This study investigates these methods’ abilities to identify anomalies in eight meteorological data streams from Corpus Christi, Texas. The results indicate that DBN-based detectors, using either robust Kalman filtering or Rao-Blackwellized particle filtering, outperform a DBN-based detector using Kalman filtering, with the former having false positive/negative rates of less than 2%. These methods were successful at identifying data anomalies caused by two real events: a sensor failure and a large storm.
منابع مشابه
Real-Time Anomaly Detection for Streaming Analytics
Much of the worlds data is streaming, time-series data, where anomalies give significant information in critical situations. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real-time, and learn while simultaneously making predictions. We present a novel anomaly detection technique based on an on-line sequence memory algorithm called Hierarch...
متن کاملOn the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data
Ever growing volume and velocity of data coupled with decreasing attention span of end users underscore the critical need for real-time analytics. In this regard, anomaly detection plays a key role as an application as well as a means to verify data fidelity. Although the subject of anomaly detection has been researched for over 100 years in a multitude of disciplines such as, but not limited t...
متن کاملStochastic Online Anomaly Analysis for Streaming Time Series
Identifying patterns in time series that exhibit anomalous behavior is of increasing importance in many domains, such as financial and Web data analysis. In real applications, time series data often arrive continuously, and usually only a single scan is allowed through the data. Batch learning and retrospective segmentation methods would not be well applicable to such scenarios. In this paper, ...
متن کاملDesign and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in dataset that is currently being produced at high speed, large volumes and with a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing the huge amount of data, today has become one of the concerns of data science scholar...
متن کاملReal-time Bayesian Anomaly Detection for Environmental Sensor Data
Recent advances in sensor technology are facilitating the deployment of sensors into the environment that can produce measurements at high spatial and/or temporal resolutions. Not only can these data be used to better characterize systems for improved modeling, but they can also be used to produce better understandings of the mechanisms of environmental processes. One such use of these data is ...
متن کامل